Using two-level morphology to transcribe Swedish names
Abstract
Names are difficult for ordinary letter-to-sound rules to handle, since such rules are usually designed for ordinary words. The structure of Swedish names differs from that of ordinary words, but their multi-morphemic structure makes them suitable for analysis with a morphological analyser. This paper presents work on names from the Swedish telephone directory, carried out as part of the ONOMASTICA project [7], including a brief study of the structure of Swedish names. The speech communication group at KTH has developed a system in which a morphological analyser is used together with a set of rules to transcribe ordinary Swedish words. This paper describes the work done to extend that system to cope with names as well, and shows that transcribing Swedish names with the Two-level Morphology analyser (TWOL) is an appropriate approach.